AVX2/Neon: Add native pointwise and pointwise_acc
#468
Mac Mini (M1, 2020) benchmarks (opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 46222 cycles | 47882 cycles | 0.97 |
| ML-DSA-44 sign | 132581 cycles | 151579 cycles | 0.87 |
| ML-DSA-44 verify | 47863 cycles | 50930 cycles | 0.94 |
| ML-DSA-65 keypair | 81125 cycles | 83679 cycles | 0.97 |
| ML-DSA-65 sign | 219258 cycles | 249921 cycles | 0.88 |
| ML-DSA-65 verify | 80130 cycles | 84582 cycles | 0.95 |
| ML-DSA-87 keypair | 132284 cycles | 135655 cycles | 0.98 |
| ML-DSA-87 sign | 280799 cycles | 314509 cycles | 0.89 |
| ML-DSA-87 verify | 130347 cycles | 136425 cycles | 0.96 |
This comment was automatically generated by workflow using github-action-benchmark.
Mac Mini (M1, 2020) benchmarks (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 115046 cycles | 115049 cycles | 1.00 |
| ML-DSA-44 sign | 431544 cycles | 431559 cycles | 1.00 |
| ML-DSA-44 verify | 122135 cycles | 122151 cycles | 1.00 |
| ML-DSA-65 keypair | 197045 cycles | 197046 cycles | 1.00 |
| ML-DSA-65 sign | 701180 cycles | 701106 cycles | 1.00 |
| ML-DSA-65 verify | 197639 cycles | 197624 cycles | 1.00 |
| ML-DSA-87 keypair | 325309 cycles | 325321 cycles | 1.00 |
| ML-DSA-87 sign | 884623 cycles | 884597 cycles | 1.00 |
| ML-DSA-87 verify | 328854 cycles | 328981 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 115170 cycles | 115355 cycles | 1.00 |
| ML-DSA-44 sign | 377460 cycles | 380513 cycles | 0.99 |
| ML-DSA-44 verify | 120403 cycles | 120663 cycles | 1.00 |
| ML-DSA-65 keypair | 199241 cycles | 199958 cycles | 1.00 |
| ML-DSA-65 sign | 623626 cycles | 630974 cycles | 0.99 |
| ML-DSA-65 verify | 198418 cycles | 199181 cycles | 1.00 |
| ML-DSA-87 keypair | 325995 cycles | 327118 cycles | 1.00 |
| ML-DSA-87 sign | 792017 cycles | 797677 cycles | 0.99 |
| ML-DSA-87 verify | 324893 cycles | 326175 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 213153 cycles | 213177 cycles | 1.00 |
| ML-DSA-44 sign | 780479 cycles | 781343 cycles | 1.00 |
| ML-DSA-44 verify | 229843 cycles | 230299 cycles | 1.00 |
| ML-DSA-65 keypair | 380942 cycles | 380958 cycles | 1.00 |
| ML-DSA-65 sign | 1303904 cycles | 1291486 cycles | 1.01 |
| ML-DSA-65 verify | 372619 cycles | 372866 cycles | 1.00 |
| ML-DSA-87 keypair | 609714 cycles | 608974 cycles | 1.00 |
| ML-DSA-87 sign | 1641842 cycles | 1641956 cycles | 1.00 |
| ML-DSA-87 verify | 621880 cycles | 621291 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 4th gen (c7i)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 35330 cycles | 35216 cycles | 1.00 |
| ML-DSA-44 sign | 120523 cycles | 125138 cycles | 0.96 |
| ML-DSA-44 verify | 38328 cycles | 39121 cycles | 0.98 |
| ML-DSA-65 keypair | 62840 cycles | 63454 cycles | 0.99 |
| ML-DSA-65 sign | 202193 cycles | 208653 cycles | 0.97 |
| ML-DSA-65 verify | 62950 cycles | 64036 cycles | 0.98 |
| ML-DSA-87 keypair | 94928 cycles | 96943 cycles | 0.98 |
| ML-DSA-87 sign | 234690 cycles | 251543 cycles | 0.93 |
| ML-DSA-87 verify | 95237 cycles | 96969 cycles | 0.98 |
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 4th gen (c7i) (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 95724 cycles | 95709 cycles | 1.00 |
| ML-DSA-44 sign | 345661 cycles | 345734 cycles | 1.00 |
| ML-DSA-44 verify | 101364 cycles | 101338 cycles | 1.00 |
| ML-DSA-65 keypair | 165054 cycles | 164680 cycles | 1.00 |
| ML-DSA-65 sign | 568432 cycles | 568266 cycles | 1.00 |
| ML-DSA-65 verify | 165543 cycles | 165518 cycles | 1.00 |
| ML-DSA-87 keypair | 270522 cycles | 270264 cycles | 1.00 |
| ML-DSA-87 sign | 725298 cycles | 725295 cycles | 1.00 |
| ML-DSA-87 verify | 274135 cycles | 273484 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 3rd gen (c6i)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 56763 cycles | 57825 cycles | 0.98 |
| ML-DSA-44 sign | 180516 cycles | 190230 cycles | 0.95 |
| ML-DSA-44 verify | 61258 cycles | 63174 cycles | 0.97 |
| ML-DSA-65 keypair | 100306 cycles | 102038 cycles | 0.98 |
| ML-DSA-65 sign | 297435 cycles | 317731 cycles | 0.94 |
| ML-DSA-65 verify | 100556 cycles | 104176 cycles | 0.97 |
| ML-DSA-87 keypair | 153797 cycles | 157788 cycles | 0.97 |
| ML-DSA-87 sign | 353873 cycles | 378266 cycles | 0.94 |
| ML-DSA-87 verify | 152631 cycles | 158668 cycles | 0.96 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 287596 cycles | 288573 cycles | 1.00 |
| ML-DSA-44 sign | 928120 cycles | 930921 cycles | 1.00 |
| ML-DSA-44 verify | 290687 cycles | 295012 cycles | 0.99 |
| ML-DSA-65 keypair | 488563 cycles | 490626 cycles | 1.00 |
| ML-DSA-65 sign | 1513930 cycles | 1551572 cycles | 0.98 |
| ML-DSA-65 verify | 476293 cycles | 481319 cycles | 0.99 |
| ML-DSA-87 keypair | 841378 cycles | 833327 cycles | 1.01 |
| ML-DSA-87 sign | 2065613 cycles | 2076843 cycles | 0.99 |
| ML-DSA-87 verify | 821369 cycles | 817076 cycles | 1.01 |
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 3rd gen (c6a)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 69182 cycles | 72965 cycles | 0.95 |
| ML-DSA-44 sign | 185561 cycles | 209026 cycles | 0.89 |
| ML-DSA-44 verify | 69174 cycles | 74357 cycles | 0.93 |
| ML-DSA-65 keypair | 119498 cycles | 122701 cycles | 0.97 |
| ML-DSA-65 sign | 296122 cycles | 330324 cycles | 0.90 |
| ML-DSA-65 verify | 115326 cycles | 121320 cycles | 0.95 |
| ML-DSA-87 keypair | 205751 cycles | 208588 cycles | 0.99 |
| ML-DSA-87 sign | 392231 cycles | 432454 cycles | 0.91 |
| ML-DSA-87 verify | 196117 cycles | 203545 cycles | 0.96 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton4
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 69390 cycles | 69452 cycles | 1.00 |
| ML-DSA-44 sign | 214041 cycles | 214770 cycles | 1.00 |
| ML-DSA-44 verify | 72664 cycles | 72508 cycles | 1.00 |
| ML-DSA-65 keypair | 123061 cycles | 122780 cycles | 1.00 |
| ML-DSA-65 sign | 351927 cycles | 353195 cycles | 1.00 |
| ML-DSA-65 verify | 120662 cycles | 120430 cycles | 1.00 |
| ML-DSA-87 keypair | 201101 cycles | 200488 cycles | 1.00 |
| ML-DSA-87 sign | 450403 cycles | 451124 cycles | 1.00 |
| ML-DSA-87 verify | 197903 cycles | 198358 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 3rd gen (c6i) (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 157843 cycles | 157874 cycles | 1.00 |
| ML-DSA-44 sign | 564586 cycles | 563418 cycles | 1.00 |
| ML-DSA-44 verify | 169419 cycles | 169267 cycles | 1.00 |
| ML-DSA-65 keypair | 269879 cycles | 269343 cycles | 1.00 |
| ML-DSA-65 sign | 926876 cycles | 928710 cycles | 1.00 |
| ML-DSA-65 verify | 274388 cycles | 274926 cycles | 1.00 |
| ML-DSA-87 keypair | 449964 cycles | 450143 cycles | 1.00 |
| ML-DSA-87 sign | 1177029 cycles | 1177838 cycles | 1.00 |
| ML-DSA-87 verify | 459130 cycles | 458629 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 3rd gen (c6a) (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 136077 cycles | 135835 cycles | 1.00 |
| ML-DSA-44 sign | 543045 cycles | 543662 cycles | 1.00 |
| ML-DSA-44 verify | 148907 cycles | 148551 cycles | 1.00 |
| ML-DSA-65 keypair | 227365 cycles | 227293 cycles | 1.00 |
| ML-DSA-65 sign | 882082 cycles | 880495 cycles | 1.00 |
| ML-DSA-65 verify | 235758 cycles | 235973 cycles | 1.00 |
| ML-DSA-87 keypair | 376752 cycles | 376279 cycles | 1.00 |
| ML-DSA-87 sign | 1103368 cycles | 1099997 cycles | 1.00 |
| ML-DSA-87 verify | 389430 cycles | 388895 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 4th gen (c7a)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 42196 cycles | 42495 cycles | 0.99 |
| ML-DSA-44 sign | 132313 cycles | 136953 cycles | 0.97 |
| ML-DSA-44 verify | 44148 cycles | 45651 cycles | 0.97 |
| ML-DSA-65 keypair | 73634 cycles | 73514 cycles | 1.00 |
| ML-DSA-65 sign | 212291 cycles | 222961 cycles | 0.95 |
| ML-DSA-65 verify | 73489 cycles | 75361 cycles | 0.98 |
| ML-DSA-87 keypair | 109492 cycles | 111831 cycles | 0.98 |
| ML-DSA-87 sign | 249171 cycles | 264029 cycles | 0.94 |
| ML-DSA-87 verify | 109508 cycles | 113897 cycles | 0.96 |
This comment was automatically generated by workflow using github-action-benchmark.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 0241a1e | Previous: a50c7c5 | Ratio |
|---|---|---|---|
| ML-DSA-65 keypair | 74894 cycles | 72607 cycles | 1.03 |
This comment was automatically generated by workflow using github-action-benchmark.
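The alert above can look odd because the displayed ratio (1.03) equals the threshold. A minimal sketch of the check, assuming the ratio is current cycles divided by previous cycles and that an alert fires only when the unrounded ratio strictly exceeds the threshold; the helper names `bench_ratio` and `bench_is_regression` are hypothetical and not part of github-action-benchmark:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ratio of current to previous cycle counts.
 * Values above 1.0 mean the current commit is slower. */
static double bench_ratio(uint64_t current, uint64_t previous)
{
  assert(previous > 0); /* a zero baseline would make the ratio meaningless */
  return (double)current / (double)previous;
}

/* Hypothetical helper: nonzero if the unrounded ratio strictly exceeds the
 * alert threshold (assumed comparison; the table shows a rounded ratio). */
static int bench_is_regression(uint64_t current, uint64_t previous,
                               double threshold)
{
  return bench_ratio(current, previous) > threshold;
}
```

With the figures from the alert, 74894 / 72607 is roughly 1.0315, which is printed as 1.03 but still lies above a threshold of 1.03 under these assumptions.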
Graviton4 (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 132632 cycles | 132765 cycles | 1.00 |
| ML-DSA-44 sign | 498393 cycles | 498360 cycles | 1.00 |
| ML-DSA-44 verify | 144871 cycles | 144978 cycles | 1.00 |
| ML-DSA-65 keypair | 227282 cycles | 227374 cycles | 1.00 |
| ML-DSA-65 sign | 814627 cycles | 813162 cycles | 1.00 |
| ML-DSA-65 verify | 231687 cycles | 231727 cycles | 1.00 |
| ML-DSA-87 keypair | 374382 cycles | 374649 cycles | 1.00 |
| ML-DSA-87 sign | 1021450 cycles | 1021467 cycles | 1.00 |
| ML-DSA-87 verify | 383912 cycles | 383727 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton2
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 115227 cycles | 116321 cycles | 0.99 |
| ML-DSA-44 sign | 377680 cycles | 382918 cycles | 0.99 |
| ML-DSA-44 verify | 120445 cycles | 121647 cycles | 0.99 |
| ML-DSA-65 keypair | 199386 cycles | 200280 cycles | 1.00 |
| ML-DSA-65 sign | 623881 cycles | 631576 cycles | 0.99 |
| ML-DSA-65 verify | 198543 cycles | 199420 cycles | 1.00 |
| ML-DSA-87 keypair | 326380 cycles | 327953 cycles | 1.00 |
| ML-DSA-87 sign | 792806 cycles | 799114 cycles | 0.99 |
| ML-DSA-87 verify | 325168 cycles | 326610 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton3
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 73683 cycles | 73940 cycles | 1.00 |
| ML-DSA-44 sign | 227417 cycles | 228396 cycles | 1.00 |
| ML-DSA-44 verify | 77957 cycles | 78071 cycles | 1.00 |
| ML-DSA-65 keypair | 129692 cycles | 129923 cycles | 1.00 |
| ML-DSA-65 sign | 376961 cycles | 377186 cycles | 1.00 |
| ML-DSA-65 verify | 128989 cycles | 129040 cycles | 1.00 |
| ML-DSA-87 keypair | 208475 cycles | 210651 cycles | 0.99 |
| ML-DSA-87 sign | 478022 cycles | 478561 cycles | 1.00 |
| ML-DSA-87 verify | 208461 cycles | 210198 cycles | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 4th gen (c7a) (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 120750 cycles | 120120 cycles | 1.01 |
| ML-DSA-44 sign | 452766 cycles | 453371 cycles | 1.00 |
| ML-DSA-44 verify | 131683 cycles | 132326 cycles | 1.00 |
| ML-DSA-65 keypair | 204985 cycles | 204716 cycles | 1.00 |
| ML-DSA-65 sign | 737151 cycles | 737570 cycles | 1.00 |
| ML-DSA-65 verify | 210237 cycles | 210009 cycles | 1.00 |
| ML-DSA-87 keypair | 338901 cycles | 338444 cycles | 1.00 |
| ML-DSA-87 sign | 941134 cycles | 941628 cycles | 1.00 |
| ML-DSA-87 verify | 348225 cycles | 349377 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton3 (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 138577 cycles | 138584 cycles | 1.00 |
| ML-DSA-44 sign | 494733 cycles | 495481 cycles | 1.00 |
| ML-DSA-44 verify | 148719 cycles | 148760 cycles | 1.00 |
| ML-DSA-65 keypair | 241622 cycles | 241312 cycles | 1.00 |
| ML-DSA-65 sign | 810227 cycles | 809760 cycles | 1.00 |
| ML-DSA-65 verify | 241033 cycles | 240909 cycles | 1.00 |
| ML-DSA-87 keypair | 396517 cycles | 396477 cycles | 1.00 |
| ML-DSA-87 sign | 1031718 cycles | 1031613 cycles | 1.00 |
| ML-DSA-87 verify | 402457 cycles | 402260 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Graviton2 (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 213591 cycles | 213647 cycles | 1.00 |
| ML-DSA-44 sign | 781116 cycles | 794200 cycles | 0.98 |
| ML-DSA-44 verify | 230160 cycles | 230157 cycles | 1.00 |
| ML-DSA-65 keypair | 381239 cycles | 381964 cycles | 1.00 |
| ML-DSA-65 sign | 1287102 cycles | 1286398 cycles | 1.00 |
| ML-DSA-65 verify | 373072 cycles | 373972 cycles | 1.00 |
| ML-DSA-87 keypair | 609771 cycles | 609842 cycles | 1.00 |
| ML-DSA-87 sign | 1644018 cycles | 1645519 cycles | 1.00 |
| ML-DSA-87 verify | 623636 cycles | 621691 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 823292 cycles | 822607 cycles | 1.00 |
| ML-DSA-44 sign | 3331458 cycles | 3327961 cycles | 1.00 |
| ML-DSA-44 verify | 920849 cycles | 919517 cycles | 1.00 |
| ML-DSA-65 keypair | 1396888 cycles | 1395382 cycles | 1.00 |
| ML-DSA-65 sign | 5441990 cycles | 5429294 cycles | 1.00 |
| ML-DSA-65 verify | 1464465 cycles | 1462157 cycles | 1.00 |
| ML-DSA-87 keypair | 2301520 cycles | 2304525 cycles | 1.00 |
| ML-DSA-87 sign | 6814378 cycles | 6826659 cycles | 1.00 |
| ML-DSA-87 verify | 2397724 cycles | 2401170 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 466206 cycles | 465321 cycles | 1.00 |
| ML-DSA-44 sign | 2215738 cycles | 2229058 cycles | 0.99 |
| ML-DSA-44 verify | 550432 cycles | 547134 cycles | 1.01 |
| ML-DSA-65 keypair | 778286 cycles | 779224 cycles | 1.00 |
| ML-DSA-65 sign | 3634433 cycles | 3638239 cycles | 1.00 |
| ML-DSA-65 verify | 849942 cycles | 850223 cycles | 1.00 |
| ML-DSA-87 keypair | 1266642 cycles | 1267615 cycles | 1.00 |
| ML-DSA-87 sign | 4543465 cycles | 4512261 cycles | 1.01 |
| ML-DSA-87 verify | 1376276 cycles | 1373672 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 229216 cycles | 241131 cycles | 0.95 |
| ML-DSA-44 sign | 680980 cycles | 692486 cycles | 0.98 |
| ML-DSA-44 verify | 232699 cycles | 231043 cycles | 1.01 |
| ML-DSA-65 keypair | 386953 cycles | 393697 cycles | 0.98 |
| ML-DSA-65 sign | 1066946 cycles | 1084494 cycles | 0.98 |
| ML-DSA-65 verify | 372780 cycles | 375628 cycles | 0.99 |
| ML-DSA-87 keypair | 660337 cycles | 671942 cycles | 0.98 |
| ML-DSA-87 sign | 1450115 cycles | 1488161 cycles | 0.97 |
| ML-DSA-87 verify | 649151 cycles | 646858 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: a41adf6 | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-65 keypair | 411845 cycles | 393697 cycles | 1.05 |
| ML-DSA-65 sign | 1135899 cycles | 1084494 cycles | 1.05 |
| ML-DSA-65 verify | 389216 cycles | 375628 cycles | 1.04 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 317658 cycles | 301890 cycles | 1.05 |
| ML-DSA-44 sign | 1209526 cycles | 1200079 cycles | 1.01 |
| ML-DSA-44 verify | 341046 cycles | 332889 cycles | 1.02 |
| ML-DSA-65 keypair | 558435 cycles | 568930 cycles | 0.98 |
| ML-DSA-65 sign | 1943000 cycles | 1985832 cycles | 0.98 |
| ML-DSA-65 verify | 532271 cycles | 539286 cycles | 0.99 |
| ML-DSA-87 keypair | 866747 cycles | 863211 cycles | 1.00 |
| ML-DSA-87 sign | 2482734 cycles | 2481765 cycles | 1.00 |
| ML-DSA-87 verify | 881465 cycles | 888876 cycles | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 216176f | Previous: abf8281 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 317658 cycles | 301890 cycles | 1.05 |
This comment was automatically generated by workflow using github-action-benchmark.
816f665 to 1fda1e9
38156be to cdf39ee
cdf39ee to a2e40c1
5276218 to 601f653
Thanks @jakemas and @jammychiou1! This is almost looking good to me.
Please fix the comment, and then clean up the commit history.
This will conclude #303 once merged.
Co-authored-by: jammychiou1 <[email protected]> Signed-off-by: Jake Massimo <[email protected]> Signed-off-by: jammychiou1 <[email protected]>
Co-authored-by: jammychiou1 <[email protected]> Signed-off-by: Jake Massimo <[email protected]> Signed-off-by: jammychiou1 <[email protected]>
Signed-off-by: Jake Massimo <[email protected]>
This commit adds a native implementation of poly_pointwise_montgomery written from scratch. Co-authored-by: Matthias J. Kannwischer <[email protected]> Signed-off-by: jammychiou1 <[email protected]>
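For readers unfamiliar with the operation being replaced, here is a minimal portable sketch of coefficient-wise multiplication in the Montgomery domain. It follows the style of the public Dilithium reference code, not the assembly added in this PR. It uses plain `int32_t` arrays rather than the project's polynomial type, and the name `sketch_poly_pointwise_montgomery` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define N 256         /* coefficients per polynomial */
#define Q 8380417     /* ML-DSA modulus q = 2^23 - 2^13 + 1 */
#define QINV 58728449 /* q^(-1) mod 2^32 */

/* Montgomery reduction: returns r with r * 2^32 == a (mod Q).
 * Assumes |a| <= 2^31 * Q; relies on arithmetic right shift of negative
 * values and wrap-around on narrowing, as common compilers provide. */
static int32_t montgomery_reduce(int64_t a)
{
  assert(a >= -((int64_t)1 << 31) * Q && a <= ((int64_t)1 << 31) * Q);
  /* t = a * QINV mod 2^32, interpreted as a signed 32-bit value */
  int32_t t = (int32_t)(uint32_t)((uint64_t)a * QINV);
  /* a - t*Q is divisible by 2^32, so the shift is exact */
  return (int32_t)((a - (int64_t)t * Q) >> 32);
}

/* c[i] = a[i] * b[i] * 2^(-32) mod Q for every coefficient (sketch). */
static void sketch_poly_pointwise_montgomery(int32_t c[N], const int32_t a[N],
                                             const int32_t b[N])
{
  for (unsigned i = 0; i < N; i++)
  {
    c[i] = montgomery_reduce((int64_t)a[i] * b[i]);
  }
}
```

A native AVX2 or Neon version would process several coefficients per instruction, but should produce results congruent to this loop modulo q.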
This commit adds native implementations of polyvecl_pointwise_acc_montgomery written from scratch. Co-authored-by: Matthias J. Kannwischer <[email protected]> Signed-off-by: jammychiou1 <[email protected]>
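The accumulating variant multiplies two vectors of polynomials coefficient-wise and sums the products into a single polynomial. The sketch below is an illustration under stated assumptions, not the code from this PR: the vectors are flat arrays of `l * N` coefficients, each product is reduced before it is added (as in the public Dilithium reference code), and the name `sketch_pointwise_acc_montgomery` is hypothetical. The project's own C and native versions may accumulate differently.

```c
#include <assert.h>
#include <stdint.h>

#define N 256         /* coefficients per polynomial */
#define Q 8380417     /* ML-DSA modulus */
#define QINV 58728449 /* q^(-1) mod 2^32 */

/* Montgomery reduction: returns r with r * 2^32 == a (mod Q). */
static int32_t montgomery_reduce(int64_t a)
{
  assert(a >= -((int64_t)1 << 31) * Q && a <= ((int64_t)1 << 31) * Q);
  int32_t t = (int32_t)(uint32_t)((uint64_t)a * QINV);
  return (int32_t)((a - (int64_t)t * Q) >> 32);
}

/* w[j] = sum over i < l of u[i][j] * v[i][j] * 2^(-32) mod Q (sketch).
 * u and v hold l polynomials back to back; l is 4, 5, or 7 for
 * ML-DSA-44, ML-DSA-65, and ML-DSA-87 respectively. */
static void sketch_pointwise_acc_montgomery(int32_t w[N], const int32_t *u,
                                            const int32_t *v, unsigned l)
{
  for (unsigned j = 0; j < N; j++)
  {
    int32_t acc = 0;
    for (unsigned i = 0; i < l; i++)
    {
      /* reduce each product, then accumulate; each term is below Q in
       * magnitude for reduced inputs, so up to 7 terms fit in int32_t */
      acc += montgomery_reduce((int64_t)u[i * N + j] * v[i * N + j]);
    }
    w[j] = acc;
  }
}
```

Fusing the multiply and the accumulation in one routine avoids writing an intermediate polynomial per vector entry, which is a plausible reason the sign benchmarks above move more than keypair or verify.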
601f653 to 78a1305
pointwise and pointwise_acc native
CBMC: polyvec_matrix-expand OBJECT_BITS 10 -> 11
CBMC: polyvecl_pointwise_acc_montgomery OBJECT_BITS 10 -> 12
CBMC: polyvecl_chknorm OBJECT_BITS 9 -> 12
Signed-off-by: Matthias J. Kannwischer <[email protected]>
78a1305 to 216176f
Thanks for all the work @jakemas and @jammychiou1!
Looks good to me!
poly_pointwise_montgomery assembly #337
polyvecl_pointwise_acc_montgomery #257
poly_pointwise_montgomery assembly #336
poly_pointwise_montgomery #207